Dataset statistics
| Number of variables | 18 |
|---|---|
| Number of observations | 4600 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 647.0 KiB |
| Average record size in memory | 144.0 B |
Variable types
| NUM | 12 |
|---|---|
| CAT | 5 |
| BOOL | 1 |
Warnings
| country has constant value "USA" | Constant |
|---|---|
| date has a high cardinality: 70 distinct values | High cardinality |
| street has a high cardinality: 4525 distinct values | High cardinality |
| statezip has a high cardinality: 77 distinct values | High cardinality |
| price is highly skewed (γ1 = 24.79093256) | Skewed |
| street is uniformly distributed | Uniform |
| price has 49 (1.1%) zeros | Zeros |
| view has 4140 (90.0%) zeros | Zeros |
| sqft_basement has 2745 (59.7%) zeros | Zeros |
| yr_renovated has 2735 (59.5%) zeros | Zeros |
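The percentages in the zero-value warnings follow from the reported zero counts and the 4600 observations. A minimal sketch in Python that recomputes them; the names `zero_counts` and `zero_share` are illustrative and not part of pandas-profiling:

```python
# Zero counts per column, copied from the warnings above.
zero_counts = {
    "price": 49,
    "view": 4140,
    "sqft_basement": 2745,
    "yr_renovated": 2735,
}
N_OBSERVATIONS = 4600  # "Number of observations" in the dataset statistics


def zero_share(count, total=N_OBSERVATIONS):
    """Return the share of zeros as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


for column, count in zero_counts.items():
    print(f"{column}: {count} zeros ({zero_share(count)}%)")
```

Each printed percentage should agree with the figure quoted in the corresponding warning.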
Reproduction
| Analysis started | 2022-03-08 12:43:20.865074 |
|---|---|
| Analysis finished | 2022-03-08 12:44:04.325945 |
| Duration | 43.46 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
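A report of this kind can be regenerated roughly as follows. This is a sketch under assumptions: the input path passed to `build_report` is hypothetical (the original file name is not stated in the report), the report title is illustrative, and the helper itself is not part of pandas-profiling. It relies only on `ProfileReport` and its `to_file` method.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so the function can be defined even when the
    # optional dependencies are not installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # dates stay as strings unless parsed explicitly
    profile = ProfileReport(df, title="Housing data profile")  # title is illustrative
    profile.to_file(html_path)
    return html_path
```

Calling `build_report("data.csv")` (with `data.csv` standing in for the real file) would write `report.html` in the working directory; the timings and memory figures will differ from the ones listed above.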
date
Categorical
| Distinct | 70 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 35.9 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2014-06-23 00:00:00 | 142 | 3.1% | |
| 2014-06-25 00:00:00 | 131 | 2.8% | |
| 2014-06-26 00:00:00 | 131 | 2.8% | |
| 2014-07-08 00:00:00 | 127 | 2.8% | |
| 2014-07-09 00:00:00 | 121 | 2.6% | |
| 2014-06-24 00:00:00 | 120 | 2.6% | |
| 2014-05-20 00:00:00 | 116 | 2.5% | |
| 2014-07-01 00:00:00 | 116 | 2.5% | |
| 2014-06-17 00:00:00 | 113 | 2.5% | |
| 2014-05-28 00:00:00 | 111 | 2.4% | |
| Other values (60) | 3372 | 73.3% |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Length
| Max length | 19 |
|---|---|
| Median length | 19 |
| Mean length | 19 |
| Min length | 19 |
price
Real number (ℝ≥0)
| Distinct | 1741 |
|---|---|
| Distinct (%) | 37.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 551962.9885 |
|---|---|
| Minimum | 0 |
| Maximum | 26590000 |
| Zeros | 49 |
| Zeros (%) | 1.1% |
| Memory size | 35.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 200000 |
| Q1 | 322875 |
| median | 460943.4615 |
| Q3 | 654962.5 |
| 95-th percentile | 1184050 |
| Maximum | 26590000 |
| Range | 26590000 |
| Interquartile range (IQR) | 332087.5 |
Descriptive statistics
| Standard deviation | 563834.7025 |
|---|---|
| Coefficient of variation (CV) | 1.021508171 |
| Kurtosis | 1044.352151 |
| Mean | 551962.9885 |
| Median Absolute Deviation (MAD) | 157500 |
| Skewness | 24.79093256 |
| Sum | 2539029747 |
| Variance | 3.179095718e+11 |
| Monotonicity | Not monotonic |
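Several of the descriptive statistics are simple functions of the others, which gives a quick consistency check. A minimal sketch in Python using only the price figures reported above; the variable names are illustrative:

```python
# Values copied from the price statistics above.
n = 4600                 # number of observations
total = 2539029747       # Sum
std = 563834.7025        # Standard deviation
q1, q3 = 322875, 654962.5
minimum, maximum = 0, 26590000

mean = total / n                 # arithmetic mean = sum / count
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"mean={mean:.4f} cv={cv:.6f} variance={variance:.6e} "
      f"iqr={iqr} range={value_range}")
```

The recomputed mean, CV, variance, IQR and range should match the table up to rounding in the reported digits.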
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 49 | 1.1% | |
| 300000 | 42 | 0.9% | |
| 400000 | 31 | 0.7% | |
| 450000 | 29 | 0.6% | |
| 440000 | 29 | 0.6% | |
| 600000 | 29 | 0.6% | |
| 350000 | 28 | 0.6% | |
| 435000 | 27 | 0.6% | |
| 250000 | 27 | 0.6% | |
| 550000 | 27 | 0.6% | |
| Other values (1731) | 4282 | 93.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 49 | 1.1% | |
| 7800 | 1 | < 0.1% | |
| 80000 | 1 | < 0.1% | |
| 83000 | 1 | < 0.1% | |
| 83300 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 26590000 | 1 | < 0.1% | |
| 12899000 | 1 | < 0.1% | |
| 7062500 | 1 | < 0.1% | |
| 4668000 | 1 | < 0.1% | |
| 4489000 | 1 | < 0.1% |
bedrooms
Real number (ℝ≥0)
| Distinct | 10 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.400869565 |
|---|---|
| Minimum | 0 |
| Maximum | 9 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Memory size | 35.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 3 |
| median | 3 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 9 |
| Range | 9 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.9088481155 |
|---|---|
| Coefficient of variation (CV) | 0.2672399215 |
| Kurtosis | 1.235377429 |
| Mean | 3.400869565 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.456446633 |
| Sum | 15644 |
| Variance | 0.8260048971 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 3 | 2032 | 44.2% | |
| 4 | 1531 | 33.3% | |
| 2 | 566 | 12.3% | |
| 5 | 353 | 7.7% | |
| 6 | 61 | 1.3% | |
| 1 | 38 | 0.8% | |
| 7 | 14 | 0.3% | |
| 0 | 2 | < 0.1% | |
| 8 | 2 | < 0.1% | |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 2 | < 0.1% | |
| 1 | 38 | 0.8% | |
| 2 | 566 | 12.3% | |
| 3 | 2032 | 44.2% | |
| 4 | 1531 | 33.3% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 9 | 1 | < 0.1% | |
| 8 | 2 | < 0.1% | |
| 7 | 14 | 0.3% | |
| 6 | 61 | 1.3% | |
| 5 | 353 | 7.7% |
bathrooms
Real number (ℝ≥0)
| Distinct | 26 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.160815217 |
|---|---|
| Minimum | 0 |
| Maximum | 8 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Memory size | 35.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1.75 |
| median | 2.25 |
| Q3 | 2.5 |
| 95-th percentile | 3.5 |
| Maximum | 8 |
| Range | 8 |
| Interquartile range (IQR) | 0.75 |
Descriptive statistics
| Standard deviation | 0.7837810747 |
|---|---|
| Coefficient of variation (CV) | 0.3627247107 |
| Kurtosis | 1.86590471 |
| Mean | 2.160815217 |
| Median Absolute Deviation (MAD) | 0.5 |
| Skewness | 0.6160327234 |
| Sum | 9939.75 |
| Variance | 0.614312773 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2.5 | 1189 | 25.8% | |
| 1 | 743 | 16.2% | |
| 1.75 | 629 | 13.7% | |
| 2 | 427 | 9.3% | |
| 2.25 | 419 | 9.1% | |
| 1.5 | 291 | 6.3% | |
| 2.75 | 276 | 6.0% | |
| 3 | 167 | 3.6% | |
| 3.5 | 162 | 3.5% | |
| 3.25 | 136 | 3.0% | |
| Other values (16) | 161 | 3.5% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 2 | < 0.1% | |
| 0.75 | 17 | 0.4% | |
| 1 | 743 | 16.2% | |
| 1.25 | 3 | 0.1% | |
| 1.5 | 291 | 6.3% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8 | 1 | < 0.1% | |
| 6.75 | 1 | < 0.1% | |
| 6.5 | 1 | < 0.1% | |
| 6.25 | 2 | < 0.1% | |
| 5.75 | 1 | < 0.1% |
sqft_living
Real number (ℝ≥0)
| Distinct | 566 |
|---|---|
| Distinct (%) | 12.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2139.346957 |
|---|---|
| Minimum | 370 |
| Maximum | 13540 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 35.9 KiB |
Quantile statistics
| Minimum | 370 |
|---|---|
| 5-th percentile | 950 |
| Q1 | 1460 |
| median | 1980 |
| Q3 | 2620 |
| 95-th percentile | 3870 |
| Maximum | 13540 |
| Range | 13170 |
| Interquartile range (IQR) | 1160 |
Descriptive statistics
| Standard deviation | 963.2069158 |
|---|---|
| Coefficient of variation (CV) | 0.4502340833 |
| Kurtosis | 8.2916826 |
| Mean | 2139.346957 |
| Median Absolute Deviation (MAD) | 570 |
| Skewness | 1.723513271 |
| Sum | 9840996 |
| Variance | 927767.5626 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1940 | 32 | 0.7% | |
| 1720 | 32 | 0.7% | |
| 1660 | 31 | 0.7% | |
| 1840 | 31 | 0.7% | |
| 2000 | 30 | 0.7% | |
| 1410 | 29 | 0.6% | |
| 1200 | 28 | 0.6% | |
| 1480 | 28 | 0.6% | |
| 1490 | 27 | 0.6% | |
| 1890 | 27 | 0.6% | |
| Other values (556) | 4305 | 93.6% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 370 | 1 | < 0.1% | |
| 380 | 1 | < 0.1% | |
| 420 | 1 | < 0.1% | |
| 430 | 1 | < 0.1% | |
| 490 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 13540 | 1 | < 0.1% | |
| 10040 | 1 | < 0.1% | |
| 9640 | 1 | < 0.1% | |
| 8670 | 1 | < 0.1% | |
| 8020 | 1 | < 0.1% |
sqft_lot
Real number (ℝ≥0)
| Distinct | 3113 |
|---|---|
| Distinct (%) | 67.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14852.51609 |
|---|---|
| Minimum | 638 |
| Maximum | 1074218 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 35.9 KiB |
Quantile statistics
| Minimum | 638 |
|---|---|
| 5-th percentile | 1690.8 |
| Q1 | 5000.75 |
| median | 7683 |
| Q3 | 11001.25 |
| 95-th percentile | 43560 |
| Maximum | 1074218 |
| Range | 1073580 |
| Interquartile range (IQR) | 6000.5 |
Descriptive statistics
| Standard deviation | 35884.43614 |
|---|---|
| Coefficient of variation (CV) | 2.416050987 |
| Kurtosis | 219.8729874 |
| Mean | 14852.51609 |
| Median Absolute Deviation (MAD) | 2772 |
| Skewness | 11.30713875 |
| Sum | 68321574 |
| Variance | 1287692757 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 5000 | 80 | 1.7% | |
| 6000 | 65 | 1.4% | |
| 4000 | 54 | 1.2% | |
| 7200 | 50 | 1.1% | |
| 4800 | 29 | 0.6% | |
| 9600 | 25 | 0.5% | |
| 4500 | 25 | 0.5% | |
| 5500 | 23 | 0.5% | |
| 3000 | 23 | 0.5% | |
| 7500 | 23 | 0.5% | |
| Other values (3103) | 4203 | 91.4% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 638 | 1 | < 0.1% | |
| 681 | 1 | < 0.1% | |
| 704 | 1 | < 0.1% | |
| 746 | 1 | < 0.1% | |
| 747 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1074218 | 1 | < 0.1% | |
| 641203 | 1 | < 0.1% | |
| 478288 | 1 | < 0.1% | |
| 435600 | 2 | < 0.1% | |
| 423838 | 1 | < 0.1% |
floors
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.512065217 |
|---|---|
| Minimum | 1 |
| Maximum | 3.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 35.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1.5 |
| Q3 | 2 |
| 95-th percentile | 2 |
| Maximum | 3.5 |
| Range | 2.5 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.5382883773 |
|---|---|
| Coefficient of variation (CV) | 0.3559954763 |
| Kurtosis | -0.5388519795 |
| Mean | 1.512065217 |
| Median Absolute Deviation (MAD) | 0.5 |
| Skewness | 0.5514406463 |
| Sum | 6955.5 |
| Variance | 0.2897543771 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 2174 | 47.3% | |
| 2 | 1811 | 39.4% | |
| 1.5 | 444 | 9.7% | |
| 3 | 128 | 2.8% | |
| 2.5 | 41 | 0.9% | |
| 3.5 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 2174 | 47.3% | |
| 1.5 | 444 | 9.7% | |
| 2 | 1811 | 39.4% | |
| 2.5 | 41 | 0.9% | |
| 3 | 128 | 2.8% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 3.5 | 2 | < 0.1% | |
| 3 | 128 | 2.8% | |
| 2.5 | 41 | 0.9% | |
| 2 | 1811 | 39.4% | |
| 1.5 | 444 | 9.7% |
waterfront
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 35.9 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 4567 | 99.3% | |
| 1 | 33 | 0.7% |
view
Real number (ℝ≥0)
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.2406521739 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 4140 |
| Zeros (%) | 90.0% |
| Memory size | 35.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 2 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.7784047172 |
|---|---|
| Coefficient of variation (CV) | 3.234563414 |
| Kurtosis | 10.46417792 |
| Mean | 0.2406521739 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.341586381 |
| Sum | 1107 |
| Variance | 0.6059139038 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 4140 | 90.0% | |
| 2 | 205 | 4.5% | |
| 3 | 116 | 2.5% | |
| 4 | 70 | 1.5% | |
| 1 | 69 | 1.5% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 4140 | 90.0% | |
| 1 | 69 | 1.5% | |
| 2 | 205 | 4.5% | |
| 3 | 116 | 2.5% | |
| 4 | 70 | 1.5% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 4 | 70 | 1.5% | |
| 3 | 116 | 2.5% | |
| 2 | 205 | 4.5% | |
| 1 | 69 | 1.5% | |
| 0 | 4140 | 90.0% |
condition
Real number (ℝ≥0)
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.45173913 |
|---|---|
| Minimum | 1 |
| Maximum | 5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 35.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 3 |
| median | 3 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 4 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.6772297676 |
|---|---|
| Coefficient of variation (CV) | 0.19619958 |
| Kurtosis | 0.1977302051 |
| Mean | 3.45173913 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.9590676635 |
| Sum | 15878 |
| Variance | 0.4586401581 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 3 | 2875 | 62.5% | |
| 4 | 1252 | 27.2% | |
| 5 | 435 | 9.5% | |
| 2 | 32 | 0.7% | |
| 1 | 6 | 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 6 | 0.1% | |
| 2 | 32 | 0.7% | |
| 3 | 2875 | 62.5% | |
| 4 | 1252 | 27.2% | |
| 5 | 435 | 9.5% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 5 | 435 | 9.5% | |
| 4 | 1252 | 27.2% | |
| 3 | 2875 | 62.5% | |
| 2 | 32 | 0.7% | |
| 1 | 6 | 0.1% |
sqft_above
Real number (ℝ≥0)
| Distinct | 511 |
|---|---|
| Distinct (%) | 11.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1827.265435 |
|---|---|
| Minimum | 370 |
| Maximum | 9410 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 35.9 KiB |
Quantile statistics
| Minimum | 370 |
|---|---|
| 5-th percentile | 860 |
| Q1 | 1190 |
| median | 1590 |
| Q3 | 2300 |
| 95-th percentile | 3440 |
| Maximum | 9410 |
| Range | 9040 |
| Interquartile range (IQR) | 1110 |
Descriptive statistics
| Standard deviation | 862.168977 |
|---|---|
| Coefficient of variation (CV) | 0.4718356515 |
| Kurtosis | 4.070138265 |
| Mean | 1827.265435 |
| Median Absolute Deviation (MAD) | 490 |
| Skewness | 1.494210748 |
| Sum | 8405421 |
| Variance | 743335.3448 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1010 | 47 | 1.0% | |
| 1200 | 47 | 1.0% | |
| 1300 | 45 | 1.0% | |
| 1140 | 44 | 1.0% | |
| 1320 | 43 | 0.9% | |
| 1150 | 42 | 0.9% | |
| 1090 | 40 | 0.9% | |
| 1180 | 40 | 0.9% | |
| 1400 | 38 | 0.8% | |
| 1050 | 37 | 0.8% | |
| Other values (501) | 4177 | 90.8% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 370 | 1 | < 0.1% | |
| 380 | 1 | < 0.1% | |
| 420 | 1 | < 0.1% | |
| 430 | 1 | < 0.1% | |
| 490 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 9410 | 1 | < 0.1% | |
| 8020 | 1 | < 0.1% | |
| 7680 | 1 | < 0.1% | |
| 7320 | 1 | < 0.1% | |
| 6640 | 1 | < 0.1% |
sqft_basement
Real number (ℝ≥0)
| Distinct | 207 |
|---|---|
| Distinct (%) | 4.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 312.0815217 |
|---|---|
| Minimum | 0 |
| Maximum | 4820 |
| Zeros | 2745 |
| Zeros (%) | 59.7% |
| Memory size | 35.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 610 |
| 95-th percentile | 1210 |
| Maximum | 4820 |
| Range | 4820 |
| Interquartile range (IQR) | 610 |
Descriptive statistics
| Standard deviation | 464.1372281 |
|---|---|
| Coefficient of variation (CV) | 1.487230726 |
| Kurtosis | 4.082380024 |
| Mean | 312.0815217 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.642732192 |
| Sum | 1435575 |
| Variance | 215423.3665 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 2745 | 59.7% | |
| 500 | 53 | 1.2% | |
| 600 | 45 | 1.0% | |
| 800 | 43 | 0.9% | |
| 900 | 41 | 0.9% | |
| 700 | 38 | 0.8% | |
| 1000 | 33 | 0.7% | |
| 400 | 33 | 0.7% | |
| 550 | 27 | 0.6% | |
| 750 | 26 | 0.6% | |
| Other values (197) | 1516 | 33.0% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 2745 | 59.7% | |
| 20 | 1 | < 0.1% | |
| 50 | 1 | < 0.1% | |
| 60 | 2 | < 0.1% | |
| 65 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 4820 | 1 | < 0.1% | |
| 4130 | 1 | < 0.1% | |
| 2850 | 1 | < 0.1% | |
| 2730 | 1 | < 0.1% | |
| 2550 | 2 | < 0.1% |
yr_built
Real number (ℝ≥0)
| Distinct | 115 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1970.786304 |
|---|---|
| Minimum | 1900 |
| Maximum | 2014 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 35.9 KiB |
Quantile statistics
| Minimum | 1900 |
|---|---|
| 5-th percentile | 1913 |
| Q1 | 1951 |
| median | 1976 |
| Q3 | 1997 |
| 95-th percentile | 2009 |
| Maximum | 2014 |
| Range | 114 |
| Interquartile range (IQR) | 46 |
Descriptive statistics
| Standard deviation | 29.73184839 |
|---|---|
| Coefficient of variation (CV) | 0.0150862873 |
| Kurtosis | -0.6700759004 |
| Mean | 1970.786304 |
| Median Absolute Deviation (MAD) | 23 |
| Skewness | -0.50215519 |
| Sum | 9065617 |
| Variance | 883.9828087 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2006 | 111 | 2.4% | |
| 2005 | 104 | 2.3% | |
| 2007 | 93 | 2.0% | |
| 2004 | 92 | 2.0% | |
| 1978 | 90 | 2.0% | |
| 2003 | 89 | 1.9% | |
| 2008 | 89 | 1.9% | |
| 1967 | 82 | 1.8% | |
| 1977 | 80 | 1.7% | |
| 2014 | 78 | 1.7% | |
| Other values (105) | 3692 | 80.3% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1900 | 22 | 0.5% | |
| 1901 | 9 | 0.2% | |
| 1902 | 10 | 0.2% | |
| 1903 | 10 | 0.2% | |
| 1904 | 9 | 0.2% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2014 | 78 | 1.7% | |
| 2013 | 57 | 1.2% | |
| 2012 | 33 | 0.7% | |
| 2011 | 24 | 0.5% | |
| 2010 | 28 | 0.6% |
yr_renovated
Real number (ℝ≥0)
| Distinct | 60 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 808.6082609 |
|---|---|
| Minimum | 0 |
| Maximum | 2014 |
| Zeros | 2735 |
| Zeros (%) | 59.5% |
| Memory size | 35.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1999 |
| 95-th percentile | 2011 |
| Maximum | 2014 |
| Range | 2014 |
| Interquartile range (IQR) | 1999 |
Descriptive statistics
| Standard deviation | 979.4145364 |
|---|---|
| Coefficient of variation (CV) | 1.211234888 |
| Kurtosis | -1.851110913 |
| Mean | 808.6082609 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.3859187009 |
| Sum | 3719598 |
| Variance | 959252.8341 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 2735 | 59.5% | |
| 2000 | 170 | 3.7% | |
| 2003 | 151 | 3.3% | |
| 2009 | 109 | 2.4% | |
| 2001 | 109 | 2.4% | |
| 2005 | 95 | 2.1% | |
| 2004 | 77 | 1.7% | |
| 2014 | 72 | 1.6% | |
| 2006 | 68 | 1.5% | |
| 2013 | 61 | 1.3% | |
| Other values (50) | 953 | 20.7% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 2735 | 59.5% | |
| 1912 | 33 | 0.7% | |
| 1913 | 1 | < 0.1% | |
| 1923 | 57 | 1.2% | |
| 1934 | 6 | 0.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2014 | 72 | 1.6% | |
| 2013 | 61 | 1.3% | |
| 2012 | 45 | 1.0% | |
| 2011 | 54 | 1.2% | |
| 2010 | 30 | 0.7% |
street
Categorical
| Distinct | 4525 |
|---|---|
| Distinct (%) | 98.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 35.9 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2520 Mulberry Walk NE | 4 | 0.1% | |
| 2500 Mulberry Walk NE | 3 | 0.1% | |
| 2300 14th Ave S | 2 | < 0.1% | |
| 611 N 46th St | 2 | < 0.1% | |
| 21132 NE 42nd St | 2 | < 0.1% | |
| 2008 Yale Ave E | 2 | < 0.1% | |
| 513 N 46th St | 2 | < 0.1% | |
| 9126 45th Ave SW | 2 | < 0.1% | |
| 11034 NE 26th Pl | 2 | < 0.1% | |
| 323 25th Ave S | 2 | < 0.1% | |
| Other values (4515) | 4577 | 99.5% |
Unique
| Unique | 4453 |
|---|---|
| Unique (%) | 96.8% |
Length
| Max length | 46 |
|---|---|
| Median length | 16 |
| Mean length | 17.01826087 |
| Min length | 8 |
city
Categorical
| Distinct | 44 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 35.9 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Seattle | 1573 | 34.2% | |
| Renton | 293 | 6.4% | |
| Bellevue | 286 | 6.2% | |
| Redmond | 235 | 5.1% | |
| Issaquah | 187 | 4.1% | |
| Kirkland | 187 | 4.1% | |
| Kent | 185 | 4.0% | |
| Auburn | 176 | 3.8% | |
| Sammamish | 175 | 3.8% | |
| Federal Way | 148 | 3.2% | |
| Other values (34) | 1155 | 25.1% |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | 0.1% |
Length
| Max length | 19 |
|---|---|
| Median length | 7 |
| Mean length | 7.753913043 |
| Min length | 4 |
statezip
Categorical
| Distinct | 77 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 35.9 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| WA 98103 | 148 | 3.2% | |
| WA 98052 | 135 | 2.9% | |
| WA 98117 | 132 | 2.9% | |
| WA 98115 | 130 | 2.8% | |
| WA 98006 | 110 | 2.4% | |
| WA 98059 | 106 | 2.3% | |
| WA 98042 | 100 | 2.2% | |
| WA 98034 | 99 | 2.2% | |
| WA 98074 | 98 | 2.1% | |
| WA 98053 | 98 | 2.1% | |
| Other values (67) | 3444 | 74.9% |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
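The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code pandas-profiling runs; the function name `pearson_r` and the toy data are assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y over the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # The 1/(n-1) factors of the covariance and the standard deviations
    # cancel, so plain sums of products are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Toy data (hypothetical, not taken from the housing dataset).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(pearson_r(x, [2 * v + 1 for v in x]))      # increasing linear relationship
print(pearson_r(x, [-0.5 * v + 10 for v in x]))  # decreasing linear relationship
```

The two toy relationships have different slopes and offsets yet both have |r| = 1 up to floating-point error, which illustrates the invariance under location and scale. A constant input has zero standard deviation, so the sketch raises `ZeroDivisionError` in that case.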
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
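As a sketch of the description above, ρ can be computed by ranking both variables and applying Pearson's formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the toy data are illustrative; ties receive the mean of their ranks, which is a common convention assumed here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance over the product of standard deviations (common factors cancel)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Toy data (hypothetical): monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print("spearman:", spearman_rho(x, y))
print("pearson: ", pearson_r(x, y))
```

On this toy data the relationship is perfectly monotonic, so ρ reaches its maximum while Pearson's r stays below 1 because the relationship is not linear.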
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
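The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch in Python follows; the function name `kendall_tau_a` and the toy data are illustrative. Library implementations commonly use the tau-b variant, which additionally adjusts for ties, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least two points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
        # product == 0 is a tie in x or y and counts as neither.
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy data (hypothetical): two concordant pairs and one discordant pair.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

The double loop over all pairs is quadratic in the number of observations, which is fine for illustration but slow for large columns.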
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | date | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | sqft_above | sqft_basement | yr_built | yr_renovated | street | city | statezip | country |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2014-05-02 00:00:00 | 313000.0 | 3.0 | 1.50 | 1340 | 7912 | 1.5 | 0 | 0 | 3 | 1340 | 0 | 1955 | 2005 | 18810 Densmore Ave N | Shoreline | WA 98133 | USA |
| 1 | 2014-05-02 00:00:00 | 2384000.0 | 5.0 | 2.50 | 3650 | 9050 | 2.0 | 0 | 4 | 5 | 3370 | 280 | 1921 | 0 | 709 W Blaine St | Seattle | WA 98119 | USA |
| 2 | 2014-05-02 00:00:00 | 342000.0 | 3.0 | 2.00 | 1930 | 11947 | 1.0 | 0 | 0 | 4 | 1930 | 0 | 1966 | 0 | 26206-26214 143rd Ave SE | Kent | WA 98042 | USA |
| 3 | 2014-05-02 00:00:00 | 420000.0 | 3.0 | 2.25 | 2000 | 8030 | 1.0 | 0 | 0 | 4 | 1000 | 1000 | 1963 | 0 | 857 170th Pl NE | Bellevue | WA 98008 | USA |
| 4 | 2014-05-02 00:00:00 | 550000.0 | 4.0 | 2.50 | 1940 | 10500 | 1.0 | 0 | 0 | 4 | 1140 | 800 | 1976 | 1992 | 9105 170th Ave NE | Redmond | WA 98052 | USA |
| 5 | 2014-05-02 00:00:00 | 490000.0 | 2.0 | 1.00 | 880 | 6380 | 1.0 | 0 | 0 | 3 | 880 | 0 | 1938 | 1994 | 522 NE 88th St | Seattle | WA 98115 | USA |
| 6 | 2014-05-02 00:00:00 | 335000.0 | 2.0 | 2.00 | 1350 | 2560 | 1.0 | 0 | 0 | 3 | 1350 | 0 | 1976 | 0 | 2616 174th Ave NE | Redmond | WA 98052 | USA |
| 7 | 2014-05-02 00:00:00 | 482000.0 | 4.0 | 2.50 | 2710 | 35868 | 2.0 | 0 | 0 | 3 | 2710 | 0 | 1989 | 0 | 23762 SE 253rd Pl | Maple Valley | WA 98038 | USA |
| 8 | 2014-05-02 00:00:00 | 452500.0 | 3.0 | 2.50 | 2430 | 88426 | 1.0 | 0 | 0 | 4 | 1570 | 860 | 1985 | 0 | 46611-46625 SE 129th St | North Bend | WA 98045 | USA |
| 9 | 2014-05-02 00:00:00 | 640000.0 | 4.0 | 2.00 | 1520 | 6200 | 1.5 | 0 | 0 | 3 | 1520 | 0 | 1945 | 2010 | 6811 55th Ave NE | Seattle | WA 98115 | USA |
Last rows
| | date | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | sqft_above | sqft_basement | yr_built | yr_renovated | street | city | statezip | country |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4590 | 2014-07-08 00:00:00 | 380680.555556 | 4.0 | 2.50 | 2620 | 8331 | 2.0 | 0 | 0 | 3 | 2620 | 0 | 1991 | 0 | 13602 SE 186th Pl | Renton | WA 98058 | USA |
| 4591 | 2014-07-08 00:00:00 | 396166.666667 | 3.0 | 1.75 | 1880 | 5752 | 1.0 | 0 | 0 | 4 | 940 | 940 | 1945 | 0 | 3529 SW Webster St | Seattle | WA 98126 | USA |
| 4592 | 2014-07-08 00:00:00 | 252980.000000 | 4.0 | 2.50 | 2530 | 8169 | 2.0 | 0 | 0 | 3 | 2530 | 0 | 1993 | 0 | 37654 18th Pl S | Federal Way | WA 98003 | USA |
| 4593 | 2014-07-08 00:00:00 | 289373.307692 | 3.0 | 2.50 | 2538 | 4600 | 2.0 | 0 | 0 | 3 | 2538 | 0 | 2013 | 1923 | 5703 Charlotte Ave SE | Auburn | WA 98092 | USA |
| 4594 | 2014-07-09 00:00:00 | 210614.285714 | 3.0 | 2.50 | 1610 | 7223 | 2.0 | 0 | 0 | 3 | 1610 | 0 | 1994 | 0 | 26306 127th Ave SE | Kent | WA 98030 | USA |
| 4595 | 2014-07-09 00:00:00 | 308166.666667 | 3.0 | 1.75 | 1510 | 6360 | 1.0 | 0 | 0 | 4 | 1510 | 0 | 1954 | 1979 | 501 N 143rd St | Seattle | WA 98133 | USA |
| 4596 | 2014-07-09 00:00:00 | 534333.333333 | 3.0 | 2.50 | 1460 | 7573 | 2.0 | 0 | 0 | 3 | 1460 | 0 | 1983 | 2009 | 14855 SE 10th Pl | Bellevue | WA 98007 | USA |
| 4597 | 2014-07-09 00:00:00 | 416904.166667 | 3.0 | 2.50 | 3010 | 7014 | 2.0 | 0 | 0 | 3 | 3010 | 0 | 2009 | 0 | 759 Ilwaco Pl NE | Renton | WA 98059 | USA |
| 4598 | 2014-07-10 00:00:00 | 203400.000000 | 4.0 | 2.00 | 2090 | 6630 | 1.0 | 0 | 0 | 3 | 1070 | 1020 | 1974 | 0 | 5148 S Creston St | Seattle | WA 98178 | USA |
| 4599 | 2014-07-10 00:00:00 | 220600.000000 | 3.0 | 2.50 | 1490 | 8102 | 2.0 | 0 | 0 | 4 | 1490 | 0 | 1990 | 0 | 18717 SE 258th St | Covington | WA 98042 | USA |